Workflow and web application for annotating NCBI BioProject transcriptome data

Authors

  • Roberto Vera
  • Newton Medeiros Vidal
  • Gina A. Garzón-Martínez
  • Luz S. Barrero
  • David Landsman
  • Leonardo Mariño-Ramírez
Abstract

The volume of transcriptome data is growing exponentially owing to rapid improvements in experimental technologies. In response, large central resources such as those of the National Center for Biotechnology Information (NCBI) are continually adapting their computational infrastructure to accommodate this influx of data. New and specialized databases, such as the Transcriptome Shotgun Assembly Sequence Database (TSA) and the Sequence Read Archive (SRA), have been created to aid the development and expansion of centralized repositories. Although the central resource databases are under continual development, they do not include automatic pipelines that add annotation to newly deposited data. Third-party applications are therefore required to achieve that aim. Here, we present an automatic workflow and web application for the annotation of transcriptome data. The workflow creates secondary data such as sequencing reads and BLAST alignments, which are available through the web application. Both the workflow and the web application are based on freely available bioinformatics tools and scripts developed in-house. The interactive web application provides a search engine and several browser utilities. Graphical views of transcript alignments are available through SeqViewer, an embedded tool developed by NCBI for viewing biological sequence data. The web application is tightly integrated with other NCBI web applications and tools to extend the functionality of data processing and interconnectivity. We present a case study for the species Physalis peruviana with data generated from BioProject ID 67621. Database URL: http://www.ncbi.nlm.nih.gov/projects/physalis/.
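
The abstract does not describe how BLAST alignments are turned into annotations, so the following is only a minimal sketch of one common approach, not the authors' workflow. It assumes the alignments are available in BLAST+ default tabular output (-outfmt 6) and keeps the highest-scoring hit per transcript below an E-value cutoff. The function name best_hits, the cutoff of 1e-5, and the example rows are hypothetical and chosen for illustration.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(handle, max_evalue=1e-5):
    """Return {transcript_id: best hit}, keeping the highest bit score.

    Hits with an E-value above max_evalue (an assumed cutoff) are ignored.
    """
    best = {}
    reader = csv.DictReader(handle, fieldnames=BLAST_COLUMNS, delimiter="\t")
    for row in reader:
        evalue = float(row["evalue"])
        bitscore = float(row["bitscore"])
        if evalue > max_evalue:
            continue
        current = best.get(row["qseqid"])
        if current is None or bitscore > current["bitscore"]:
            best[row["qseqid"]] = {
                "subject": row["sseqid"],
                "evalue": evalue,
                "bitscore": bitscore,
            }
    return best


# Hypothetical alignment rows; identifiers and scores are made up.
EXAMPLE = (
    "transcript_1\tprotein_A\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-100\t380\n"
    "transcript_1\tprotein_B\t80.0\t150\t30\t0\t1\t150\t5\t155\t1e-40\t160\n"
    "transcript_2\tprotein_C\t35.0\t60\t39\t0\t1\t60\t1\t60\t0.5\t30\n"
)

annotations = best_hits(io.StringIO(EXAMPLE))
for transcript, hit in annotations.items():
    print(transcript, hit["subject"], hit["evalue"], hit["bitscore"])
```

In this sketch a transcript whose only hits fail the cutoff is simply left out of the result, which is one way to distinguish annotated from unannotated sequences.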

Similar resources

BioProject and BioSample databases at NCBI: facilitating capture and organization of metadata

As the volume and complexity of data sets archived at NCBI grow rapidly, so does the need to gather and organize the associated metadata. Although metadata has been collected for some archival databases, previously, there was no centralized approach at NCBI for collecting this information and using it across databases. The BioProject database was recently established to facilitate organization ...

The nuclear and mitochondrial genomes of the facultatively eusocial orchid bee Euglossa dilemma

Authors: Philipp Brand*,†, Nicholas Saleh*,†, Hailin Pan‡, Cai Li‡, Karen M. Kapheim§, Santiago R. Ramírez* Affiliations: * Department for Evolution and Ecology, Center for Population Biology, University of California, Davis, California 95616 † Graduate Group in Population Biology, University of Cal...

BioProject Help

The BioProject resource is a redesigned, expanded, replacement of the NCBI Genome Project resource. The redesign adds tracking of several data elements including more precise information about a project’s scope, material, and objectives. Genome Project identifiers are retained in the BioProject as the ID value for a record, and an Accession number has been added. Other changes include a more fl...

Characterization of the Asian Citrus Psyllid Transcriptome

The Asian citrus psyllid, Diaphorina citri Kuwayama (Hemiptera: Psyllidae) is a vector for the causative agents of Huanglongbing, which threatens citrus production worldwide. This study reports and discusses the first D. citri transcriptomes, encompassing the three main life stages of D. citri, egg, nymph and adult. The transcriptomes were annotated using Gene Ontology (GO) and insecticide-rela...

Transcriptome data of Epinephelus fuscoguttatus infected by Vibrio vulnificus

Vibriosis caused by Vibrio spp. has greatly reduced the productivity of aquaculture, including that of the brown-marbled grouper (Epinephelus fuscoguttatus), an economically important fish species in Malaysia. Preventive measures and immediate treatment are critical to reduce the mortality of E. fuscoguttatus from vibriosis. To investigate the molecular mechanisms associated with immune response and host-b...

Journal title:

Volume 2017, Issue

Pages -

Publication date: 2017